ABSTRACT

The fact that the clustering of dark matter halos depends not only on their mass, but also on their formation
epoch, is a prominent, albeit subtle, feature of the cold dark matter structure formation theory, and is known
as assembly bias. At low mass scales ($\sim 10^{12}\,h^{-1}M_\odot$), early-forming halos are predicted to be more strongly
clustered than the late-forming ones. In this study we aim to robustly detect the signature of assembly bias
observationally, making use of formation time indicators of central galaxies in low mass halos as a proxy for
the halo formation history. Weak gravitational lensing is employed to ensure our early- and late-forming halo
samples have similar masses, and are free of contamination by satellites from more massive halos. For the two
formation time indicators used (resolved star formation history and current specific star formation rate), we
do not find convincing evidence of assembly bias. For a pair of early- and late-forming galaxy samples with
mean mass $M_{200c} \sim 9 \times 10^{11}\,h^{-1}M_\odot$, the relative bias is $1.00 \pm 0.12$. We attribute the lack of detection to the
possibilities that either the current measurements of these indicators are too noisy, or they do not correlate well 
with the halo formation history. Alternative proxies for the halo formation history that should perform better 
are suggested for future studies. 

Subject headings: cosmology: large-scale structure of Universe, galaxies: formation, galaxies: haloes


1. INTRODUCTION 

In the cold dark matter (CDM) structure formation
paradigm, the spatial distribution and internal structure of
dark matter halos depend not only on their mass, but also on
their formation time, an effect known as assembly bias
(Gao et al. 2005; Croton et al. 2007). For halos of mass
below or close to the nonlinear mass scale $M_{\rm nl}$, those
that form earlier cluster more strongly and are more
concentrated, while those that form later are less clustered
and less concentrated (Gao et al. 2005; Zhu et al. 2006). For
much more massive halos, the situation reverses and the
later-forming ones actually cluster more strongly, although
the effect is weaker compared to the low mass case.

As the halo assembly bias is a very distinct feature of
the CDM theory of structure formation, several groups have
looked for its observational evidence. Yang et al. (2006) are
the first to claim the detection of the effect. Using a galaxy
group catalog that extends to low mass systems with just one
or two galaxies, they find that for halos of similar masses,
those with a central galaxy that has a low star formation rate
(SFR) cluster more strongly than those with a high SFR
central galaxy. If the SFR of central galaxies correlates well
with the formation history of the halos in such a way that
lower SFR corresponds to earlier formation epoch, then the
observed clustering properties would be a manifestation of
assembly bias.

The way halo mass is estimated in the Yang et al. (2006)
group catalog is in spirit similar$^{1}$ to the abundance matching
technique (e.g., Kravtsov et al. 2004; Tasitsiomi et al. 2004;
Conroy et al. 2006), that is, a one-to-one correspondence
between the total luminosity (or stellar mass) content of a
galactic system and its total mass is assumed (Yang et al. 2005).
Therefore, at the low halo mass end the halo mass ranking
is equivalent to the luminosity (or stellar mass) ranking of the
central galaxies. It is possible that galaxy formation processes
would induce large scatter between the luminosity (or stellar
mass) of the central and the host halo mass, and that the mean
relationships between galaxy and host halo properties may be
different for early- and late-type (or red and blue) galaxies.
Some subsequent studies of assembly bias (e.g., Wang et al.
2013; Lacerna et al. 2014) also employ the later versions of
the group catalog generated by Yang et al. (2006), so if there
are systematic biases in the halo mass estimates due to the
assumptions made, then the later studies would be subject to
those biases as well (see also Section 5.3).

Two key assumptions in the Yang et al. (2006) analysis
are that the current SFR of central galaxies is a good
indicator of their formation history, and that the formation
history of a central is intimately linked to that of the host
halo. Wang et al. (2013) have examined results from
semi-analytic galaxy formation models, and found a correlation
between the current specific SFR (sSFR, star formation rate per
unit stellar mass) and the formation epoch, which indirectly
supports the former assumption. Furthermore, in the
age-matching framework (Hearin & Watson 2013; Hearin et al.
2014; Watson et al. 2015), which is an extension of the
subhalo abundance matching technique and is capable of
reproducing the color-dependence of galaxy clustering, the
present-day color or sSFR of galaxies is assumed to directly
correspond to the (sub)halo formation time (however see
Masaki et al. 2013 for a different point of view). Therefore,
both of these assumptions seem to be well founded in
theoretical/phenomenological models of galaxy formation (see
also Section 5.1), and form the foundation for studying
assembly bias with galaxy populations.

$^{1}$ Abundance-matching methods that are used in practice include
additional complexity, such as scatter between halo properties
and observed galaxy properties.

Instead of using the sSFR, Lacerna et al. (2014) consider
the luminosity-weighted age of the central galaxies as a proxy
of the halo formation epoch. They find that groups with an
older central galaxy cluster more strongly than those with a
younger central galaxy, in line with the prediction of assembly
bias.

Here we present an observational study of the assembly bias 
that is distinct from previous investigations in several ways. 
Although we also employ the group catalog from Yang and 
collaborators, and assume that the formation history of halos 
can be inferred from the properties of the central galaxies, we 
do not rely on the halo mass estimates given in the catalog. 
Instead, we use galaxy-galaxy lensing to ensure our early- and 
late-forming halo samples are of similar masses. Second, in 
addition to sSFR, we also consider the temporally resolved 
star formation history (SFH) of central galaxies to estimate 
the formation time of the host halos. Third, as any presence 
of satellites in a central galaxy sample would inevitably bias 
the halo mass and the large scale clustering measurements, 
they have to be removed from the sample. We have taken care 
of such contaminants in our analysis. Fourth, it is difficult 
to infer the halo mass probability distribution for an observed 
galaxy sample. We have devised a way to take into account 
such uncertainties when we compare our data with theoretical 
expectations. 

It is possible to extend the concept of assembly bias from
halos to galaxies, such that for galaxies of the same stellar
mass, those that form earlier would cluster more strongly
(e.g., Li et al. 2006; Cooper et al. 2010; Wang et al. 2013).
We shall refer to this phenomenon as galactic assembly bias,
and note that in the limit that the stellar mass of a galaxy is
a perfect proxy for its dark matter halo mass, the two types
of assembly bias are manifestations of the same phenomenon.
However, both the intrinsic scatter in the central galaxy stellar
mass-halo mass relation and the presence of satellite galaxies
in the galaxy samples may obscure the interpretation of
galactic assembly bias. In this study we thus focus on detecting
and quantifying the magnitude of halo assembly bias from
central galaxies of low mass halos.

It is also possible to generalize the assembly bias to refer to
the dependence of clustering and internal structure on any halo
parameters other than the mass (e.g., Wechsler et al. 2005;
Li et al. 2008; Zentner et al. 2014). Throughout this work we
follow the original definition and only investigate the effect of
halo formation time on the clustering properties.

Detecting and quantifying the magnitude of assembly bias
will have far-reaching impact on galaxy formation studies,
especially for a class of phenomenological models known as the
halo occupation distribution (HOD; e.g., Berlind & Weinberg
2002). In the standard HOD framework, the halo mass is
assumed to be the only governing factor controlling galaxy
formation. Blindly inferring the relationship between galaxies
and halos from the galaxy clustering properties without
accounting for the effect of assembly bias will lead to biased
results, as demonstrated by Zentner et al. (2014).

The structure of this paper is as follows. In Section 2 we
describe our data, including the galaxy and group catalogs, the
measurements of sSFR and SFH, two-point correlation
function, the stacked lensing signals, and the simulations we use
to derive theoretical expectations. In Section 3 we repeat some
of the previous attempts at detecting the assembly bias,
showing that most of the claimed signal can actually be attributed
to halo mass differences. Then in Section 4 we construct galaxy
samples that can be used most robustly (as allowed by the
quality of our data) to measure the assembly bias. We discuss
caveats and implications of our analysis and future directions
in Section 5, and summarize our results in Section 6.

Throughout this paper we adopt a WMAP5 (Komatsu et al.
2009) $\Lambda$CDM cosmological model, where $\Omega_m = 0.26$,
$\Omega_\Lambda = 0.74$, $H_0 = 100h\,{\rm km\,s^{-1}\,Mpc^{-1}}$ with $h = 0.71$,
$\sigma_8 = 0.8$ and $M_{\rm nl} = 2.7 \times 10^{12}\,h^{-1}M_\odot$ at $z = 0$.
Unless otherwise noted, the halo mass definition we adopt is
$M_{200c}$, the mass enclosed in $r_{200c}$, within which the mean
overdensity is 200 times the critical density of the Universe
$\rho_{\rm crit}$ at the redshift of the halo. Another popular mass
definition, $M_{200m}$, defined analogously to $M_{200c}$ but with
respect to the mean density of the Universe, will be used in
parts of Section 4.
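
As an illustration of the mass definition above, the following minimal sketch converts $M_{200c}$ into the corresponding radius $r_{200c}$. The function name `r200c`, the constant `RHO_CRIT_0`, and the choice of returning a physical (not comoving) radius in $h^{-1}$Mpc are assumptions made for this example, not part of the analysis pipeline.

```python
import math

# Critical density at z = 0 in h^2 M_sun Mpc^-3 (standard value, assumed here).
RHO_CRIT_0 = 2.775e11


def r200c(m200c, z=0.0, omega_m=0.26, omega_l=0.74):
    """Physical radius (h^-1 Mpc) enclosing 200 times the critical density.

    m200c is in h^-1 M_sun. The default cosmology is the WMAP5-like
    model quoted in the text; a flat universe is assumed.
    """
    # Critical density scales as E(z)^2 in a flat LCDM universe.
    e2 = omega_m * (1.0 + z) ** 3 + omega_l
    rho_crit = RHO_CRIT_0 * e2
    # M = (4 pi / 3) * 200 * rho_crit * r^3, solved for r.
    return (3.0 * m200c / (4.0 * math.pi * 200.0 * rho_crit)) ** (1.0 / 3.0)


if __name__ == "__main__":
    # A halo of 1e12 h^-1 M_sun at z = 0 has r200c of roughly 0.16 h^-1 Mpc.
    print(r200c(1.0e12))
```

Replacing `rho_crit` by the mean matter density would give the radius for the $M_{200m}$ definition in the same way.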

2. KEY ELEMENTS OF ANALYSIS 
2.1. Catalog of Central Galaxies 

Similar to previous observational studies of assembly bias,
we assume that if the formation history of galaxies in a halo
could somehow be related to that of the host halo, it is the
central galaxy that best traces the halo formation history.
Therefore, the starting point of our analysis is a catalog of
central galaxies, for which we use the galaxy group catalog of
Yang et al. (2007, hereafter Y07). The version of the catalog
used is based on data release seven (DR7; Abazajian et al.
2009) of the Sloan Digital Sky Survey (SDSS; York et al.
2000). The Y07 group finding algorithm is essentially an
adaptive matched filter applied to the spectroscopic sample
of SDSS. Its unique features include (1) the identification of
groups of all richness, ranging from massive clusters down to
systems with just one member, (2) designation of central and
satellite galaxies among the group members, (3) assignment
of total mass of the groups via an abundance matching method
(see Section 1; in this study we adopt the mass based on the
luminosity content ranking). The DR7 group catalog contains
374,052 systems$^{2}$ over 7,748 deg$^2$ out to $z = 0.2$ (Wang et al.).

After selecting the galaxies identified as centrals from the
Y07 group catalog, we seek measurements of sSFR and SFH
from public data sets. For the former we adopt the
measurements from the DR7 version of the MPA/JHU value-added
galaxy catalog (Brinchmann et al. 2004), which are derived
from emission lines as well as the strength of the 4000 Å
break.

For the resolved SFH, we utilize the results of the VESPA
algorithm (Tojeiro et al.), which are available for query
in the WFCAM science archive. In short, VESPA considers
combinations of bursts of star formation of different
metallicity, and chooses the one that best fits the observed SDSS
spectrum. For each galaxy, it provides the stellar mass formed
in 16 temporal (age) bins, where the first bin spans the
period of 9 to 14 Gyr before the redshift of the galaxy (and
other bins are closer to the present epoch). In addition to
using two of the popular stellar population synthesis models
(Bruzual & Charlot 2003; Maraston 2005) to model the
spectrophotometric evolution of the galaxies, two different
dust extinction models are also considered in VESPA.
Therefore, for each galaxy, four SFHs are provided (Tojeiro et al.
2009). For simplicity, we adopt the SFHs based on the
Bruzual & Charlot (2003) model, and demand consistency
between the SFHs derived from the two dust models (that is,
a galaxy should be classified as early- or late-forming in both
models).

$^{2}$ Specifically, we use their sample III. Note that there are 472,416 groups
in the DR7 catalog, but we only use those that have a halo mass assignment
and a spectrum measured for the central galaxy.

In the following, unless otherwise stated, an early-forming
galaxy is defined to have formed 50% of its final stellar mass
in the first temporal bin, or roughly by $z = 1.9$ for a galaxy
observed at $z = 0.1$, which is close to the mean redshift of
the samples we use in Section 4. In contrast, a late-forming
galaxy would have formed 50% of its final mass after the first
temporal bin (or at $z < 1.9$ if it is observed at $z = 0.1$).
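
The classification rule above can be sketched as follows. This is a minimal illustration only: the function names, the list layout (index 0 holding the oldest, 9 to 14 Gyr bin), and the treatment of a galaxy sitting exactly at 50% as early-forming are assumptions made here, and the actual VESPA archive columns are not reproduced.

```python
def formed_early(mass_per_bin, threshold=0.5):
    """True if at least `threshold` of the stellar mass formed in the oldest bin.

    mass_per_bin: stellar mass formed in each age bin, with index 0 being
    the oldest bin (hypothetical layout chosen for this sketch).
    """
    total = sum(mass_per_bin)
    if total <= 0:
        raise ValueError("total stellar mass must be positive")
    return mass_per_bin[0] / total >= threshold


def classify_formation(sfh_dust_model_a, sfh_dust_model_b):
    """Return 'early' or 'late' only when both dust models agree, else None."""
    early_a = formed_early(sfh_dust_model_a)
    early_b = formed_early(sfh_dust_model_b)
    if early_a and early_b:
        return "early"
    if not early_a and not early_b:
        return "late"
    return None  # inconsistent classification: galaxy is dropped


if __name__ == "__main__":
    print(classify_formation([6, 2, 2], [5, 3, 2]))  # both models: early
    print(classify_formation([6, 2, 2], [4, 3, 3]))  # models disagree
```

Galaxies for which the function returns `None` correspond to those rejected by the consistency requirement between the two dust models.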


2.2. Correlation Function Measurements

To measure the large-scale bias of the galaxy samples, we
first calculate the redshift-space two-point auto-correlation
function via the standard estimator (Landy & Szalay 1993)

$$\xi_s = \frac{\langle DD\rangle - 2\langle DR\rangle + \langle RR\rangle}{\langle RR\rangle} \qquad (1)$$


where $\langle DD\rangle$, $\langle DR\rangle$, and $\langle RR\rangle$ are the normalized numbers of
data-data, data-random, and random-random pairs in a given
separation bin, respectively. In practice, to bypass
complications due to redshift space distortions, we measure $\xi_s$ in two
dimensions, both perpendicular to and along the line of sight
(denoted as $r_p$- and $\pi$-directions, respectively), then compute
the projected correlation function by integrating over the $\pi$
direction

$$w_p(r_p) = 2\int_0^{\pi_{\rm up}} d\pi\,\xi_s(r_p, \pi). \qquad (2)$$
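
The two steps above (pair-count estimator, then line-of-sight integration) can be sketched in a few lines. This is a hedged illustration rather than the code used for the measurements: the function names are hypothetical, pair counts are assumed to be counts of unique pairs, and the integral in Equation (2) is approximated by a simple sum over $\pi$ bins.

```python
import numpy as np


def landy_szalay(dd, dr, rr, n_data, n_rand):
    """Landy & Szalay (1993) estimator from raw pair counts.

    dd, rr: counts of unique data-data and random-random pairs per bin;
    dr: data-random cross pairs per bin. Counts are normalized by the
    total number of possible pairs of each kind.
    """
    dd_n = dd / (n_data * (n_data - 1) / 2.0)
    dr_n = dr / (n_data * n_rand)
    rr_n = rr / (n_rand * (n_rand - 1) / 2.0)
    return (dd_n - 2.0 * dr_n + rr_n) / rr_n


def projected_wp(xi_rp_pi, pi_edges):
    """w_p(r_p) = 2 * integral of xi_s(r_p, pi) over pi from 0 to pi_up.

    xi_rp_pi has shape (n_rp, n_pi); pi_edges has n_pi + 1 entries
    running from 0 to pi_up (same length units as r_p).
    """
    d_pi = np.diff(pi_edges)
    return 2.0 * np.sum(xi_rp_pi * d_pi, axis=1)


if __name__ == "__main__":
    xi = landy_szalay(np.array([3.0, 6.0]), np.array([15.0, 15.0]),
                      np.array([10.0, 10.0]), n_data=3, n_rand=5)
    print(xi)
    print(projected_wp(np.full((2, 6), 0.5), np.linspace(0.0, 60.0, 7)))
```

In practice the pair counts would come from a tree-based pair counter run on the galaxy and random catalogs; only the combination of the counts is shown here.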


Following common practice, the integration upper limit is
chosen to be $\pi_{\rm up} = 60\,h^{-1}$Mpc, so that the results are not
affected by redshift space distortion below $r_p \sim 30\,h^{-1}$Mpc (e.g.,
Padmanabhan et al. 2007). As will be described below, our
galaxy samples have sizes of several tens of thousands. We
therefore use one million random points over the DR7
footprint, generated with the mask provided by the NYU
value-added galaxy catalog (Blanton et al. 2005). The covariance
matrix of the correlation function is calculated using the
jackknife resampling method. We have divided the DR7 footprint
into 100 equal-area subregions, and have constructed the
jackknife samples by omitting each of the subregions in turn. The
covariance matrix is then estimated as

$$C(w_{p,i}, w_{p,j}) = \frac{N-1}{N}\sum_{\ell=1}^{N}\left(w_{p,i}^{\ell} - \bar{w}_{p,i}\right)\left(w_{p,j}^{\ell} - \bar{w}_{p,j}\right) \qquad (3)$$


where $N = 100$, $\ell$ is the index of the jackknife samples, $i$, $j$ are
indices of $r_p$ bins, and $\bar{w}_p$ is the mean from all the jackknife
samples. Note that $N$ is much larger than the number of
radial bins in our correlation function measurements (see, e.g.,
appendix D of Hirata et al. 2004, which includes a method of
simulating the impact of noise on the jackknife-based $\chi^2$ that
we use here to estimate p-values), and the size of the
jackknife subregions is much larger than the maximum size of
the clustering used for the analysis ($< 35\,h^{-1}$Mpc), so that the
samples nearly satisfy the i.i.d. (independent and
identically distributed) requirement. These conditions validate the
use of jackknife resampling.
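
The jackknife covariance estimate described above can be written compactly. The sketch below assumes the leave-one-out measurements are already stored in an array of shape $(N, n_{\rm bins})$; the function name is hypothetical.

```python
import numpy as np


def jackknife_covariance(wp_jackknife):
    """Jackknife covariance of w_p from N leave-one-out measurements.

    wp_jackknife: array of shape (N, n_bins), where row l holds w_p
    measured with subregion l omitted. Implements
    C_ij = (N - 1) / N * sum_l (w_i^l - mean_i) * (w_j^l - mean_j).
    """
    wp = np.asarray(wp_jackknife, dtype=float)
    n = wp.shape[0]
    diff = wp - wp.mean(axis=0)
    return (n - 1.0) / n * diff.T @ diff


if __name__ == "__main__":
    # Toy input with N = 2 jackknife samples and 2 radial bins.
    print(jackknife_covariance([[1.0, 2.0], [3.0, 6.0]]))
```

The $(N-1)/N$ prefactor, rather than $1/(N-1)$, accounts for the strong overlap between leave-one-out samples.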


2.3. Galaxy-Galaxy Lensing Measurements

Galaxy-galaxy weak lensing, the coherent tangential shear
of background galaxies due to foreground lens galaxies,
provides a simple way to probe the connection between the lens
galaxies and matter via their cross-correlation function $\xi_{gm}(r)$.
This cross-correlation can be related to the projected surface
density$^{3}$ via

$$\Sigma(r_p) = \bar{\rho}\int\left[1 + \xi_{gm}\left(\sqrt{r_p^2 + \pi^2}\right)\right]d\pi. \qquad (4)$$


The surface density is then related to the observable quantity
for lensing,

$$\Delta\Sigma(r_p) = \gamma_t(r_p)\,\Sigma_c = \bar{\Sigma}(<r_p) - \Sigma(r_p). \qquad (5)$$

This observable quantity can be expressed as the product of
two factors, a tangential shear $\gamma_t$ and a geometric factor

$$\Sigma_c = \frac{c^2}{4\pi G}\,\frac{D_S}{D_L D_{LS}(1+z_L)^2} \qquad (6)$$

where $D_L$ and $D_S$ are angular diameter distances to the lens
and source, $D_{LS}$ is the angular diameter distance between the
lens and source, and the factor of $(1+z_L)^{-2}$ arises due to our
use of comoving coordinates.
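
For concreteness, the geometric factor of Equation (6) can be evaluated as in the sketch below. Several things here are assumptions of the example rather than of the analysis: a flat $\Lambda$CDM universe with $\Omega_m = 0.26$, a simple trapezoid-rule distance integral, distances in $h^{-1}$Mpc (so the result is in $h\,M_\odot\,{\rm Mpc}^{-2}$), and the function and constant names.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
G_MPC = 4.30091e-9    # Newton's constant in Mpc (km/s)^2 / M_sun


def comoving_distance(z, omega_m=0.26, n_steps=2000):
    """Line-of-sight comoving distance in h^-1 Mpc for flat LCDM."""
    omega_l = 1.0 - omega_m
    dz = z / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        zi = i * dz
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid rule
        total += weight / math.sqrt(omega_m * (1.0 + zi) ** 3 + omega_l)
    # Hubble distance c / H0 with H0 = 100 h km/s/Mpc.
    return (C_KM_S / 100.0) * total * dz


def sigma_crit_comoving(z_l, z_s, omega_m=0.26):
    """Critical surface density in comoving coordinates, Equation (6)."""
    chi_l = comoving_distance(z_l, omega_m)
    chi_s = comoving_distance(z_s, omega_m)
    # Angular diameter distances in a flat universe.
    d_l = chi_l / (1.0 + z_l)
    d_s = chi_s / (1.0 + z_s)
    d_ls = (chi_s - chi_l) / (1.0 + z_s)
    prefactor = C_KM_S ** 2 / (4.0 * math.pi * G_MPC)
    return prefactor * d_s / (d_l * d_ls * (1.0 + z_l) ** 2)


if __name__ == "__main__":
    print(sigma_crit_comoving(0.1, 0.4))
```

For a fixed lens redshift the factor decreases as the source moves farther behind the lens, which is why distant sources carry more lensing weight.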

As sources to measure the galaxy-galaxy lensing signal, we
use a catalog (Reyes et al. 2012) of 1.2 background galaxies
per arcmin$^2$ with weak lensing shears estimated using
the re-Gaussianization method (Hirata & Seljak 2003) and
photometric redshifts from the Zurich Extragalactic Bayesian
Redshift Analyzer (ZEBRA; Feldmann et al. 2006). The
catalog is characterized in detail in several papers (see
Reyes et al. 2012; Mandelbaum et al. 2012; Nakajima et al.
2012; Mandelbaum et al. 2013).

The lensing measurement begins with identification of
background galaxies around each lens (with photometric
redshift larger than the lens spectroscopic redshift). Inverse
variance weights are assigned to each lens-source pair, including
both shape noise and measurement error terms in the variance:

$$w_{ls} = \frac{\Sigma_c^{-2}}{\sigma_e^2 + \sigma_{\rm SN}^2} \qquad (7)$$

where $\sigma_e$ is the shape measurement error due to pixel noise,
and $\sigma_{\rm SN}$ is the root-mean-square intrinsic ellipticity (both
quantities are per component, rather than total; the latter is
fixed to 0.365 following Reyes et al. 2012). Use of
photometric redshifts, which have nonzero bias and significant
scatter, gives rise to a bias in the signals that can be easily
corrected using the method from Nakajima et al. (2012). This
bias is a function of lens redshift, and is calculated including
all weight factors for each lens sample taking into account its
redshift distribution. For typical low redshift lens samples in
this work, the bias is of order 1 per cent, far below the
statistical errors.


$^{3}$ Here we neglect the very broad radial window function.
$\Delta\Sigma$ in each radial bin can be computed via a summation
over lens-source pairs "ls" and random lens-source pairs "rs":

$$\Delta\Sigma(r_p) = \frac{\sum_{ls} w_{ls}\,e_t^{(ls)}\,\Sigma_c(z_l, z_s)}{2\mathcal{R}\sum_{rs} w_{rs}} \qquad (8)$$

where $r_p$ is the comoving projected radius from the lens, $e_t$ is
the tangential ellipticity component of the source galaxy with
respect to the lens position, $\mathcal{R} \approx (1 - e_{\rm rms}^2)$ is the shear
responsivity (Bernstein & Jarvis 2002) that converts from the ensemble
average distortion to shear, and $e_{\rm rms}$ is the root-mean-square
distortion per component. The division by $\sum w_{rs}$ accounts
for the fact that some of our "sources" are physically
associated with the lens, and therefore not lensed by it (see, e.g.,
Sheldon et al. 2004). Finally, we subtract off a similar signal
measured around random points with the same area
coverage and redshift distribution as the lenses, to remove any
coherent systematic shear contributions (Mandelbaum et al.
2005b); this signal is statistically consistent with zero for all
scales used in this work.
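
A minimal sketch of the weighted sum in Equation (8) for a single radial bin is given below. The function names, the use of 0.365 as the default for both $\sigma_{\rm SN}$ and $e_{\rm rms}$, and the assumption that the random-source weight sum has already been rescaled to the number of real lenses are choices made for this illustration; the subtraction of the signal around random points is not included.

```python
import numpy as np


def pair_weight(sigma_c, sigma_e, sigma_sn=0.365):
    """Inverse-variance weight for a lens-source pair (Equation 7 form)."""
    return sigma_c ** -2 / (sigma_e ** 2 + sigma_sn ** 2)


def delta_sigma_bin(w_ls, e_t, sigma_c, w_rs_sum, e_rms=0.365):
    """Stacked Delta Sigma in one radial bin.

    w_ls, e_t, sigma_c: arrays over lens-source pairs in the bin.
    w_rs_sum: summed weights of random lens-source pairs in the same
    bin (assumed already rescaled to the number of real lenses).
    """
    responsivity = 1.0 - e_rms ** 2
    numerator = np.sum(w_ls * e_t * sigma_c)
    return numerator / (2.0 * responsivity * w_rs_sum)


if __name__ == "__main__":
    w = np.array([1.0, 1.0])
    print(delta_sigma_bin(w, np.array([0.01, 0.03]),
                          np.array([2.0, 2.0]), w_rs_sum=2.0, e_rms=0.0))
```

Dividing by the random-pair weights rather than the lens-source weights is what corrects for sources that are physically associated with the lens.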

To calculate the error bars, which are dominated by shape
noise, we use the jackknife resampling method. The
maximum scale used for the fits in the lensing analysis is $1\,h^{-1}$Mpc,
which for a typical lens redshift is far below the typical size of
each jackknife resampled region. Thus, the jackknife method
is a reasonable approach to getting the covariance matrix for
the projected mass profile.

All of the projected mass around lens galaxies contributes
to the galaxy-galaxy lensing signal. This includes
contributions from the host dark matter halo in which the lens galaxy
resides ("1-halo term"), and from other dark matter halos
("two-halo term") that are part of large-scale structure
associated with the lens. For central galaxies, the 1-halo term simply
corresponds to the $\Delta\Sigma$ for the host dark matter halo. For
satellite galaxies, there are two contributions to the 1-halo term: on
small scales, a contribution from the satellite subhalo, and on
larger scales ($0.3$-$2\,h^{-1}$Mpc) a contribution from the host halo
itself. See, e.g., Mandelbaum et al. (2005a) for illustrations of
these contributions to the galaxy-galaxy lensing signal. The
distinctive shape of the satellite contribution in the lensing
signal will be a diagnostic of possible contamination of our
"central galaxy" sample by satellites in this work.

We fit the observed lensing profiles to the prediction for a
pure Navarro et al. (1997, hereafter NFW) profile. The NFW
profile in principle has two parameters: the concentration and
the total mass within some fiducial radius, for which we use
a spherical overdensity of $200\rho_{\rm crit}$. We fix the relationship
between concentration $c_{200c}$ and mass $M_{200c}$ using a fitting
formula from Diemer & Kravtsov (2015).

For a given mass (and therefore concentration), the $\Delta\Sigma$ is
predicted via direct integration. Finally, we fit for the mass via
$\chi^2$ minimization using the Levenberg-Marquardt algorithm.
This fit is carried out twice for each sample. In the first case,
we fit from 0.04 to $0.3\,h^{-1}$Mpc, and compare the best-fit
profile with the observed signal out to larger scales, $1\,h^{-1}$Mpc. If
the sample is contaminated by satellites, this will be evident in
a substantial excess in the observed signal for $r_p > 0.3\,h^{-1}$Mpc
compared to the theoretical one fit to $r_p < 0.3\,h^{-1}$Mpc and
then extrapolated to larger scales. We use this test as a way
to identify samples that are not truly central galaxies and that
therefore cannot be used for our analysis (see Section 3). If
we do not see any sign of satellite contamination, then we
redo the fit using $\Delta\Sigma$ from 0.04 to $1\,h^{-1}$Mpc (as in Section 4).
As shown directly by Mandelbaum et al. (2005a) using fits to
lensing signals from simulated galaxy samples based on
N-body simulations, the best-fitting mass will lie between the
true median and mean halo mass for the sample, and is most
easily interpreted for samples with relatively narrow mass
distributions.


2.4. Numerical Simulations 

To compare our measurements of assembly bias against
theoretical expectations, we use three sets of cosmological
simulations. The first one is a subset of N-body simulations
presented in Diemer & Kravtsov (2014; specifically the L0250,
L0500, and L1000 boxes). Each of these simulations
follows $1024^3$ dark matter particles. Combined with the box
sizes of 250, 500, and $1000\,h^{-1}$Mpc on a side, the
resulting particle mass resolutions are $1.1 \times 10^{9}$, $8.7 \times 10^{9}$, and
$7 \times 10^{10}\,h^{-1}M_\odot$, respectively. The force resolution lengths
are 5.8, 14 and 33 comoving $h^{-1}$kpc, respectively. The
cosmological parameters used in these simulations ($\Omega_m = 1 -
\Omega_\Lambda = 0.27$, $h = 0.7$, $\sigma_8 = 0.82$, $M_{\rm nl} = 3.2 \times 10^{12}\,h^{-1}M_\odot$ at
$z = 0$) are close to the ones adopted in the current study.
Halos and merger trees were extracted using the Rockstar
(Behroozi et al. 2013a) and consistent-trees (Behroozi et al.
2013b) codes. For more details of the simulations we refer
the reader to Diemer & Kravtsov (2014).
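
The quoted particle masses follow directly from the box size, particle number, and $\Omega_m$. The short check below is a sketch under the assumption that all matter is carried by the $1024^3$ particles; the function name and the constant are introduced here for illustration.

```python
# Critical density at z = 0 in h^2 M_sun Mpc^-3 (standard value, assumed here).
RHO_CRIT_0 = 2.775e11


def particle_mass(box_size, n_per_side=1024, omega_m=0.27):
    """Particle mass in h^-1 M_sun for a box of side box_size (h^-1 Mpc).

    The total matter mass in the box, Omega_m * rho_crit * L^3, is shared
    equally among n_per_side^3 particles.
    """
    total_mass = omega_m * RHO_CRIT_0 * box_size ** 3
    return total_mass / n_per_side ** 3


if __name__ == "__main__":
    for box in (250.0, 500.0, 1000.0):
        # Roughly 1.1e9, 8.7e9 and 7e10 h^-1 M_sun, respectively.
        print(box, particle_mass(box))
```

Because the particle number is fixed, the mass resolution degrades by a factor of eight each time the box side doubles.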

The main quantity we wish to derive from this set of
simulation data is the redshift of formation $z_{\rm form}$ for the halos, for
which we consider two commonly used definitions: (1) $z_{50}$,
the epoch when a halo has acquired 50% of its final mass, and
(2) $z_{\rm mah}$, which is obtained from the mass accretion history
of halos, following the prescription of Wechsler et al. (2002).
We fit the mass accretion history of halos by the form

$$M(z) \propto \exp(-\alpha z), \qquad (9)$$

with the Levenberg-Marquardt algorithm for minimization of
the least squares, and define $z_{\rm mah} = 2/\alpha - 1$. Since $z_{\rm mah}$ is
determined by the overall merger history, it is less sensitive to
individual events in halo assembly than other definitions of
$z_{\rm form}$ (Wechsler et al. 2002). In Section 4.1 we will compare
the clustering properties of central galaxies selected by their
SFH with that of halos, selected by either definition of $z_{\rm form}$.
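
Both formation-time definitions can be illustrated on a tabulated mass accretion history. The sketch below makes two simplifications that should be read as assumptions: it replaces the Levenberg-Marquardt fit by a straight-line least-squares fit to $\ln M$ versus $z$ (which weights the points differently), and it obtains $z_{50}$ by linear interpolation. Function names are hypothetical.

```python
import numpy as np


def z_mah(z, mass):
    """Formation redshift 2 / alpha - 1 from M(z) proportional to exp(-alpha z).

    alpha is estimated from a linear fit to ln M versus z, a stand-in
    for the non-linear least-squares fit described in the text.
    """
    slope, _intercept = np.polyfit(z, np.log(mass), 1)
    alpha = -slope
    return 2.0 / alpha - 1.0


def z_50(z, mass):
    """Redshift at which the halo first reached half of its z = z[0] mass.

    Assumes z is increasing (z[0] is the final epoch) and that mass
    decreases monotonically with z.
    """
    frac = np.asarray(mass) / mass[0]
    # np.interp needs increasing x, so reverse the decreasing fractions.
    return float(np.interp(0.5, frac[::-1], np.asarray(z)[::-1]))


if __name__ == "__main__":
    z = np.linspace(0.0, 3.0, 31)
    m = 1.0e12 * np.exp(-0.8 * z)   # synthetic history with alpha = 0.8
    print(z_mah(z, m), z_50(z, m))
```

For a purely exponential history the two definitions are linked by $z_{50} = \ln 2/\alpha$, so they rank halos identically; they differ for real histories with discrete merger events.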

The second set of simulations we use is in the form of
a mock galaxy catalog, which is made publicly available
by Watson et al. (2015) and is produced by applying the
age-matching model to the Bolshoi simulation (Klypin et al.
2011). The Bolshoi simulation box size is the same as that
of the L0250 simulation mentioned above, but with much
better mass resolution ($1.35 \times 10^{8}\,h^{-1}M_\odot$). The cosmology
adopted by the Bolshoi simulation is the same as that of
Diemer & Kravtsov (2014).

Figure 1. Ratio of early-to-late-forming halo correlation functions from
simulations of Diemer & Kravtsov (2014). From bottom to top, each row
represents a halo mass bin with width of 0.5 dex, in order of increasing mass
(logarithm of halo mass range is indicated on the lower right corner of each
panel). The column on the left is for results obtained with $z_{\rm form} = z_{\rm mah}$. The
right column is obtained with $z_{\rm form} = z_{50}$.

To create the mock catalog, Watson et al. (2015) first use
the standard abundance matching technique to assign stellar
mass to the subhalos (by associating stellar mass with the
maximum circular velocity a subhalo has ever attained). Then, for
subhalos within a given stellar mass range, a sSFR is further
assigned to each subhalo (following the observed distribution
of sSFR from galaxies of the same stellar mass range),
under the simplifying assumption that older subhalos have lower
sSFR. Assembly bias is therefore maximally built in to the
properties of the mock galaxies via a monotonic relationship
between age and sSFR. In the resulting mock catalog that
contains 207,950 galaxies, we have information such as the
position within the simulation box, designation as central or
satellite, the stellar mass and sSFR, and the host halo mass,
for each galaxy. In Section 4.2 we will make use of this
catalog and compare the clustering properties of central galaxies
selected by sSFR with the predictions from the age-matching
model.
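
The monotonic age-to-sSFR assignment described above amounts to a rank ordering within each stellar mass bin. The following sketch shows the idea for a single bin; it assumes the number of observed sSFR values equals the number of subhalos, uses a hypothetical formation-redshift array as the age indicator, and is not the Watson et al. (2015) implementation.

```python
import numpy as np


def assign_ssfr_by_age(z_form, observed_ssfr):
    """Assign sSFR values to subhalos so that older subhalos get lower sSFR.

    z_form: formation redshift of each subhalo in one stellar mass bin
    (higher means older). observed_ssfr: equally many sSFR values drawn
    from the observed distribution for that bin.
    """
    z_form = np.asarray(z_form, dtype=float)
    ssfr_sorted = np.sort(np.asarray(observed_ssfr, dtype=float))  # low to high
    if z_form.shape != ssfr_sorted.shape:
        raise ValueError("need one observed sSFR value per subhalo")
    oldest_first = np.argsort(-z_form)
    assigned = np.empty_like(ssfr_sorted)
    # The oldest subhalo receives the lowest sSFR, and so on down the ranks.
    assigned[oldest_first] = ssfr_sorted
    return assigned


if __name__ == "__main__":
    print(assign_ssfr_by_age([0.5, 2.0, 1.0], [1e-10, 1e-12, 1e-11]))
```

Because the mapping has no scatter, any clustering dependence on subhalo age is transferred in full to the sSFR-selected samples, which is the sense in which assembly bias is maximal in such a mock.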

As our third set of simulations, we make use of the catalog
from the semi-analytic model of Guo et al. (2011), which is
built upon the halo and subhalo merger history extracted from
the Millennium Run simulation (Springel et al. 2005). The
latter follows $2160^3$ dark matter particles in a $(500\,h^{-1}{\rm Mpc})^3$
box, with the mass and force resolutions of $8.6 \times 10^{8}\,h^{-1}M_\odot$
and $5\,h^{-1}$kpc, respectively. The cosmology adopted differs
from our default one mainly in $\sigma_8$: $\Omega_m = 1 - \Omega_\Lambda = 0.25$,
$h = 0.73$, $\sigma_8 = 0.9$. The semi-analytic model provides, as a
function of redshift, properties such as the total halo mass,
stellar mass, broadband color, sSFR, and mean mass-weighted
stellar age for the galaxies. We have extracted information
for central galaxies whose present-day halo mass is $M_{200c} >
10^{11}\,h^{-1}M_\odot$, and computed $z_{50}$ and $z_{\rm mah}$ for the stellar mass
assembly history. In Section 4.1 we will compare the prediction
of assembly bias from this model with our observations.


2.4.1. Magnitude of Assembly Bias 

It is informative to examine the expected magnitude of
assembly bias as a function of halo mass. We have used the
suite of simulations of Diemer & Kravtsov (2014) to
calculate the ratio of real space correlation functions between the
early- and late-forming halos, in several halo mass bins. In
each mass bin, we designate halos that have $z_{\rm form}$ that is higher
(lower) than the mode of the distribution as early-forming
(late-forming). For $z_{\rm form}$, we consider both $z_{50}$ and $z_{\rm mah}$. Note
that the way halos are split into early- and late-forming ones
here is slightly different from what we will employ when
comparing to actual galaxy samples (Section 4). The results
are shown in Fig. 1. From bottom to top, each row represents a
mass bin in order of increasing halo mass. The left (right)
column is obtained when $z_{\rm form} = z_{\rm mah}$ ($z_{50}$). For our definition of
halo formation time and the halo sample selection, we expect
the magnitude of assembly bias to be small, only
appreciable at $\log(M_{200c}/h^{-1}M_\odot) < 13$. This informs our decision to
focus on halos around $\log(M_{200c}/h^{-1}M_\odot) \approx 12$ in Sections 3
and 4. It is interesting to note that while the amplitude of
assembly bias is similar for both definitions of $z_{\rm form}$, the detailed
halo mass dependence is somewhat different; we see clearly
the sign change at $\log(M_{200c}/h^{-1}M_\odot) \approx 14$ when using $z_{\rm mah}$
(that is, younger halos are more strongly clustered; see also
Wechsler et al. 2005; Dalal et al. 2008), but no evidence for
it with $z_{50}$ for all the mass bins we have examined, which is
consistent with the results of Li et al. (2008).

Finally, from Fig. 1 we see that the magnitude of
assembly bias at large scales appears to depend on scale. This
behavior is seen in both the Diemer & Kravtsov (2014) and
Millennium simulations. Such scale dependence of the
assembly bias potentially has very significant implications for
cosmological constraints using large-scale clustering and will
be investigated in a future publication.

2.4.2. Robustness of Halo Mass Inference from Lensing 
Measurements 

A potential concern for our analyses in the following
sections is that the halo mass distributions for the early- and
late-forming galaxy samples may be so different that the lensing
measurements are biased from the true mean values in
different ways. The age-matching mock catalog, together with
the particle data from the Bolshoi simulation, provides a way
to check this issue. We have constructed a pair of early- and
late-forming mock central galaxy samples, selected by
stellar mass (as a proxy of halo mass) and sSFR (as a proxy
of formation time, in the sense that low and high sSFR
represent early and late formation) such that their projected
correlation functions at large scales are similar. The true
mean halo masses for the low and high sSFR samples are
$M_{200c} = 1.6 \times 10^{12}\,h^{-1}M_\odot$ and $2.3 \times 10^{12}\,h^{-1}M_\odot$, respectively.
The halo mass distributions of the two samples are almost
identical except for the peak locations. We then perform mock
lensing observations and determine the best-fit NFW masses
to be $1.7 \times 10^{12}\,h^{-1}M_\odot$ and $2.3 \times 10^{12}\,h^{-1}M_\odot$. Therefore, the
ratio of the best-fitting NFW masses from lensing is very
similar to that of the real mean halo masses for these samples.
This exercise alleviates the aforementioned concern.

3. THE DIRECT APPROACH: SELECTION OF HALOS WITH 
ABUNDANCE MATCHING-BASED MASS 

Before presenting our main results in Section 4, here we
first follow the approach adopted in Yang et al. (2006) and
seek signs of assembly bias. However, we add an additional
step, which is to use galaxy-galaxy lensing to test the
fundamental assumption that the halo masses in the catalog can
be used to select early- and late-forming centrals with similar
halo masses and without the contamination by satellite
galaxies. These results inform the approach we use in Section 4.

We start with the sample of central galaxies with halo mass 
within the range \og{M 2 mc/M q) = 12.0- 12.5, according to 
the Y07 catalog. We divide the galaxies into high- and low- 
sSFR samples (containing 75,452 and 61,743 galaxies), with 
Figure 2. Measurements of projected correlation function (top panel) and
surface mass density contrast (bottom panel) for the central galaxies selected
with the Y07 halo mass within the range $\log(M_{200c}/M_\odot) = 12.0$-$12.5$,
further separated into low- and high-sSFR samples (red and blue points,
respectively). The middle panel shows the relative bias squared of the two
samples: the low-sSFR sample has systematically higher bias, but this is
mainly due to the ~1.9 times difference in halo mass: galaxy-galaxy lensing
indicates the two samples have mass $M_{200c}$ of
$(9.0^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ and
$(4.6^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$, respectively.
The two curves in the bottom panel represent the best-fit NFW profiles
(magenta: low-sSFR; cyan: high-sSFR).

the division at sSFR = $10^{-11}\,{\rm yr}^{-1}$. The projected
correlation function and the surface mass density contrast of the
samples are shown in the top and bottom panels of Fig. 2,
respectively. In these panels, the red and blue points represent
the low- and high-sSFR samples, respectively. From the top panel
we see that the low-sSFR sample has a systematically higher
clustering amplitude. The ratio of the low-sSFR-to-high-sSFR
correlation functions, which represents the relative bias squared
of the two samples (at large scales), is shown in the middle
panel, and is clearly different from unity (although we note the
points are highly correlated). This is similar to what
Yang et al. (2006) have found (although they separated the samples
using the SFR, not sSFR). If the two galaxy samples had the same
halo mass, then this would represent observational evidence of
assembly bias.

In the bottom panel we show the galaxy-galaxy lensing
measurements of the two samples. They clearly have different
surface density contrasts. Fitting an NFW model to $\Delta\Sigma$
over $0.04 < r_p < 0.3\,h^{-1}$Mpc gives the total halo mass $M_{200c}$
of $(9.0^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ and
$(4.6^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ for the
low-sSFR and high-sSFR samples, respectively.

We note that in the Figure, there are non-negligible
contributions from satellite galaxies. This can be seen in both
the correlation function and galaxy-galaxy lensing measurements.
For the case of the correlation function, in a pure central
sample, due to the halo exclusion effect, the signal should
flatten at scales below the mean halo radius (e.g.,
Tinker et al. 2005). However, a one-halo term is clearly present
in our correlation functions. For the case of lensing
measurements, the “bump” from 0.5 to 2 $h^{-1}$Mpc in
$\Delta\Sigma$ is also due to the presence of satellites
(Section 2.3). Given the mean halo masses of the low-sSFR and
high-sSFR samples, we can compute the expected relative bias
squared, using the fitting formulae given in Tinker et al. (2010).
In the low halo mass regime of our galaxy sample, the bias is a
slowly varying function of halo mass, resulting in a factor of
1.13 in relative bias squared. Given that the measured value is
1.34 ± 0.19 (over 5-30 $h^{-1}$Mpc), it seems possible that
contamination from satellite galaxies in high mass halos could
partially explain the difference in correlation function
amplitudes.

Figure 3. Similar to Fig. 2, but for central galaxies selected by the
resolved SFH from VESPA. The red and blue points represent the early- and
late-forming samples, respectively. Although the early-forming sample has
higher large-scale bias, this is mainly due to the ~3.4 times difference in halo
mass: galaxy-galaxy lensing indicates the two samples have mass $M_{200c}$ of
$(9.7^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ and
$(2.8^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$, respectively.
The two curves in the bottom panel represent the best-fit NFW profiles
(magenta: early-forming; cyan: late-forming).

We next consider the case where the central galaxy sample is
separated by the resolved SFH from VESPA. We again start with the
galaxies within the halo mass range
$\log(M_{200c}/M_\odot) = 12.0$-$12.5$ (according to the Y07
catalog). We define the early-forming galaxy sample to consist of
those that have formed 50% of their final stellar mass in the
first temporal bin (containing 63,933 galaxies), while the
late-forming galaxies are those that have formed 50% of the final
mass in later bins (56,760 galaxies). The results are shown in
Fig. 3. The red and blue points now represent the early- and
late-forming galaxies, respectively. Again, although the
early-forming sample has a higher bias, this is due to its higher
halo mass: the total masses are
$(9.7^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ and
$(2.8^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ for the
early- and late-forming samples, respectively. The presence of
satellites is also apparent from both the correlation function and
lensing measurements. The relative bias squared from the halo mass
difference alone is 1.25, consistent with the measurement
(1.29 ± 0.31).

We therefore conclude that it is likely that either the mean
relationship between total luminosity and halo mass is incorrect
for some subsamples of the group catalog, or that the scatter in
the Y07 halo mass estimates is not random but rather correlates
with physical properties of the galaxies (such as sSFR and SFH).
Simply taking the halo mass estimates from the Y07 catalog would
lead to false detections of assembly bias. We also conclude that
some of the central galaxies in the catalog are actually
misidentified satellites.

4. OUR TWO-STEP APPROACH: HALO MASS MEASUREMENTS 
FROM WEAK LENSING 

Equipped with the experience gained from the exercises in
Section 3, we now present our approach to the detection of
assembly bias. Our goal is to construct early- and late-forming
central galaxy samples that have similar halo masses. As in
Section 2, we classify galaxies as early- or late-forming via
either the resolved SFH from VESPA or the current sSFR. Below we
describe our procedures and results with these two methods in
turn.

4.1. Classification by Resolved Star Formation History 

The most important finding from Section 3 is that, after
identifying samples with the same Y07 halo mass estimates and
splitting by either the SFH or sSFR, the mean halo masses of the
resulting early- and late-forming samples are actually quite
different. This makes such a procedure an invalid way to test for
assembly bias, and motivates us to develop a two-step approach.
We start with two central galaxy samples (denoted as samples 1
and 2) that are selected by a certain halo mass proxy, according
to which the mean masses satisfy $M_1 < M_2$. After splitting the
samples by either the SFH or sSFR, we refer to the mean halo
masses of the resulting early- and late-forming subsamples as
$M_{1e}$, $M_{1l}$, $M_{2e}$, and $M_{2l}$ (so that in general
$M_{1e} > M_{1l}$ and $M_{2e} > M_{2l}$); these are then measured
by weak lensing. We can achieve $M_{1e} \approx M_{2l}$ (e.g.,
within $1\sigma$) by adjusting the chosen range of mass proxy of
sample 2 ($M_2$) relative to that of sample 1 ($M_1$).

It is also found in Section 3 that there is non-negligible
satellite contamination in the Y07 central galaxy sample. As
these satellites typically reside in halos more massive than the
ones that host our central galaxies, their large-scale bias
reflects that of their hosts, and therefore the inclusion of
these satellites would dilute any assembly bias signal we are
after. Below we describe details of the steps that lead to our
final samples.

4.1.1. Sample Construction 

Our starting point is a halo mass proxy that can help define
samples 1 and 2. For the SFH-selected samples, we choose to use
the central galaxy stellar mass-halo mass relationship for this
purpose, and adopt the measurements by More et al. (2011,
hereafter M11), which are done separately for red and blue
centrals (see the solid curves in Fig. 4), using satellite
kinematics. Using these relations as an approximate guideline, we
can select central galaxies in certain stellar


Figure 4. The red and blue curves represent the central galaxy stellar
mass-halo mass relations for the red and blue centrals, respectively, from
More et al. (2011). The shaded regions bounding the curves represent the
$1\sigma$ uncertainty around the mean. The two horizontal bands represent the
first step in our sample construction (i.e., defining the samples 1 and 2). By
selecting red and blue galaxies with stellar mass in the ranges where the
horizontal band crosses over the two curves (delineated by the vertical dashed
and dotted lines), we obtain galaxy samples that may have similar halo masses.
These samples are further split into early- and late-forming galaxies by the
VESPA-based SFH. Using weak lensing it is found that the late-forming galaxies
derived from the samples represented by the green horizontal band and the
early-forming galaxies originated from the samples defined by the dark red
horizontal band have similar mean halo masses. We note that the halo mass
definition in the M11 relations is $M_{200b}$ instead of $M_{200c}$ adopted
throughout our analysis. As we use these relations for adjusting the relative
masses of galaxy samples, the absolute mass scale (and thus the mass
definition) is not important for our purpose.

mass ranges so that they might live in halos of similar masses.
As suggested by Fig. 4, we have to combine blue centrals of
higher stellar mass with red ones of lower stellar mass. The
VESPA-based SFH is then used to classify the galaxies into early-
and late-forming ones, for which mean halo masses are measured by
galaxy-galaxy lensing.

In implementing the above procedure, we follow M11 and
estimate the stellar mass via

$$\log\left[\frac{M_{\rm star}}{h^{-2}M_\odot}\right] = -0.406 + 1.097\,(g-r) - 0.4\,(M_r - 5\log h - 4.64) \qquad (10)$$

(see also Bell et al. 2003). Here $M_r$ is the SDSS r-band
absolute magnitude, and $(g-r)$ is estimated in the rest-frame;
for both of these we use the SDSS Petrosian magnitudes with
k-corrections taken from the NYU value-added galaxy catalog
(Blanton et al. 2005). After examination of our galaxies in the
color-stellar mass space, we adopt a red/blue division line that
is somewhat different from that used in M11:

$$(g-r)_{\rm div} = 0.67 + 0.03\,\log\left[M_{\rm star}/(10^{10}\,h^{-2}M_\odot)\right]. \qquad (11)$$
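
As a rough illustration of Eqs. (10) and (11), the sketch below evaluates the stellar mass estimate and the red/blue division for a single galaxy. The function names are hypothetical, and treating a galaxy lying exactly on the dividing line as blue is an assumption of this sketch; the text does not specify that detail.

```python
def log_stellar_mass(g_minus_r, mr_minus_5logh):
    """Eq. (10): log10 of the stellar mass in units of h^-2 Msun.

    g_minus_r      : rest-frame (g - r) color
    mr_minus_5logh : r-band absolute magnitude, M_r - 5 log h
    """
    return -0.406 + 1.097 * g_minus_r - 0.4 * (mr_minus_5logh - 4.64)


def color_division(log_mstar):
    """Eq. (11): (g - r) dividing line at log10(M_star / h^-2 Msun)."""
    return 0.67 + 0.03 * (log_mstar - 10.0)


def is_red(g_minus_r, mr_minus_5logh):
    """True if the galaxy lies above the dividing line (assumed convention)."""
    log_m = log_stellar_mass(g_minus_r, mr_minus_5logh)
    return g_minus_r > color_division(log_m)


# Illustrative input values only, not taken from the data.
example_log_m = log_stellar_mass(0.8, -20.5)
example_is_red = is_red(0.8, -20.5)
```

For the illustrative input above, the dividing color is evaluated at the galaxy's own stellar mass, so the classification depends on both the color and the absolute magnitude.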

The next step in our procedure is the removal of satellites. 

Fig. 4 provides a way to understand the relative halo masses for
the low- and high-sSFR samples studied in Section 3. Selection by sSFR
is equivalent to a color selection. We also know red galaxies typically have
larger stellar mass-to-light ratio than blue galaxies do. Since the Y07 halo
mass estimates are based on the luminosity content of the groups which, at
the low mass scale we study, is the same as the luminosity of the central
galaxies, this implies the low-sSFR sample would have higher stellar mass
than the high-sSFR one; from Fig. 4 we see it is natural for the two samples
to have different halo masses.
We have found that satellite contamination is more serious in
red galaxy samples (i.e., more enhanced 1-halo term in the galaxy
clustering signal). Thus, in the above process, after a red
central sample is selected, we apply a FOF algorithm (see
Jian et al. 2014 for the description of the code) to identify
groups of galaxies, using comoving linking lengths along and
perpendicular to the line-of-sight of 1 and 0.08 (in units of the
mean galaxy separation), respectively. We only keep the most
massive member in the groups identified. During this step, we
consider not only the red galaxy sample in question, but also
other red centrals in the Y07 central catalog, as well as blue
centrals in halos with $\log(M_{200b}/h^{-1}M_\odot) > 12.5$
(according to the M11 relations). Then, after combining the blue
central sample with the satellite-trimmed red central sample, the
FOF algorithm is applied again to further remove remaining
satellites. Only after these steps do we separate the sample into
early- and late-forming ones. About 7-15% of galaxies are removed
this way.
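
The satellite-trimming step can be sketched as a friends-of-friends pass with separate linking lengths along and perpendicular to the line-of-sight. This is a minimal illustration, not the Jian et al. (2014) code: the function name, the flat Cartesian geometry, the use of a single mean separation for all galaxies, and the brute-force pair loop are all assumptions made here for brevity.

```python
import math


def fof_keep_most_massive(galaxies, mean_sep, b_los=1.0, b_perp=0.08):
    """Return indices of the most massive member of each FOF group.

    galaxies : list of (x, y, los, mass) tuples; x and y are comoving
               coordinates perpendicular to the line-of-sight, los is the
               comoving line-of-sight coordinate (hypothetical layout)
    mean_sep : mean galaxy separation, in the same length units
    """
    n = len(galaxies)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        xi, yi, zi, _ = galaxies[i]
        for j in range(i + 1, n):
            xj, yj, zj, _ = galaxies[j]
            perp = math.hypot(xi - xj, yi - yj)
            los = abs(zi - zj)
            # Two galaxies are "friends" only if both criteria are met.
            if perp <= b_perp * mean_sep and los <= b_los * mean_sep:
                parent[find(i)] = find(j)

    best = {}
    for i, gal in enumerate(galaxies):
        root = find(i)
        if root not in best or gal[3] > galaxies[best[root]][3]:
            best[root] = i
    return sorted(best.values())


# Illustrative positions and masses only.
toy_galaxies = [
    (0.0, 0.0, 0.0, 10.5),
    (0.3, 0.0, 2.0, 10.9),
    (0.3, 0.3, 8.0, 10.0),
    (10.0, 10.0, 0.0, 10.2),
]
kept = fof_keep_most_massive(toy_galaxies, mean_sep=5.0)
```

In a flux-limited sample the mean separation varies with redshift, so a realistic implementation would evaluate the linking lengths per pair and use a spatial index instead of the quadratic loop.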

It should also be noted that not every galaxy can be
unambiguously identified as early- or late-forming from the VESPA
SFH, as we have opted for purity and demanded consistency between
the two dust models (Section 2.1). The early- and late-forming
sample sizes would therefore be further reduced (by 5-17%)
compared to the parent samples.

We have thus obtained a pair of early- and late-forming
samples that we believe are dominated by central galaxies in
their halos and have similar mean halo masses. The way they are
constructed is depicted in Fig. 4. In the Figure, two horizontal
bands (shaded regions) are shown around
$\log(M_{200b}/h^{-1}M_\odot) \approx 12.4$. Each band denotes the
initial halo mass range ($M_1$ and $M_2$ in the notation used
above) guessed from the central stellar mass-halo mass relations.
The green one represents sample 2 from which a late-forming
sample is derived. Similarly, the yellow band represents sample 1
from which an early-forming sample is obtained. More
specifically, the log stellar mass ranges for which the red and
blue galaxies are selected are 9.9-10.2 and 10.2-10.45
(10.35-10.6 and 10.52-11.1) for sample 1 (2). These ranges are
delineated by the vertical dashed and dotted lines. The resulting
early- and late-forming samples contain 18,200 and 26,071
galaxies, respectively. Using galaxy-galaxy lensing, it is found
that these samples have mean masses ($M_{200c}$) that are
statistically consistent:
$M_{1e} = (9.5^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ and
$M_{2l} = (8.4^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$
(see bottom panel of Fig. 5). Below we refer to these as our SFH
samples.

In the top and middle panels of Fig. 5 we show the comparison
of the correlation functions for these samples. It is clear that
the power at small scales (e.g., $< 1\,h^{-1}$Mpc) is much reduced
compared to Figs. 2 and 3, due to the removal of satellites with
our FOF procedure. (The lensing signal shown in the bottom panel
matches the NFW fit for $r_p < 2\,h^{-1}$Mpc, which further
supports the conclusion that we do not have an appreciable
satellite population in the final galaxy samples.) At large
scales ($r_p > 5\,h^{-1}$Mpc), we see in the middle panel the
(square of) relative bias scatters around unity (indicated

Since we use a flux-limited sample, the mean galaxy separation
varies with redshift. The resulting comoving linking lengths range from
0.18 to 1.92 $h^{-1}$Mpc (perpendicular to the line-of-sight) and from 2.20 to
23.96 $h^{-1}$Mpc (along the line-of-sight). We note that our choices of linking
lengths (100% and 8% of the mean galaxy separation along and perpendicular
to the line-of-sight) have not been rigorously tested against mock catalogs;
rather they are only validated by inspection of the lensing and clustering
signals to confirm the absence of a 1-halo (satellite) term.
Figure 5. Measurements of projected correlation function (top panel) and
surface mass density contrast (bottom panel) for the central galaxies
initially selected with the M11 central-halo relations (see the green and dark
red horizontal bands in Fig. 4), then further separated into early- and
late-forming samples using the VESPA-based SFH (red and blue points,
respectively). The points in the middle panel show the relative bias squared of
the two samples. The square root of the mean ratio (over 5-35 $h^{-1}$Mpc)
is 1.00 ± 0.12. Galaxy-galaxy lensing indicates the two samples have mass
$M_{200c}$ of $(9.5^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$ and
$(8.4^{+\ldots}_{-\ldots}) \times 10^{11}\,h^{-1}M_\odot$, respectively.
The curves in the bottom panel represent the best-fit NFW profiles (magenta:
early-forming; cyan: late-forming). In the top panel, the two short dashed
curves show the predictions for the early- and late-forming halos from the
L0250 simulation. The long dashed curves are those from the Guo et al.
(2011) model, based on the Millennium simulation. The ratios of the two
sets of curves (early-to-late) are shown as the curves in the middle panel
(short dashed: L0250; long dashed: Millennium), with the shaded regions
representing the $1\sigma$ uncertainties from the models. These models are
found to be inconsistent with the observations (see text for details).

by the horizontal dotted line), suggesting that there is not a
strong difference in the large-scale clustering of the two
samples. Over the scales 5-35 $h^{-1}$Mpc, the square root of the
mean ratio is 1.00 ± 0.12.
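
The quoted statistic, the square root of the mean ratio of the two projected correlation functions over a range of scales, can be sketched as follows. The function name is hypothetical, and the use of an unweighted mean over bins is an assumption of this sketch; the input numbers are made up for illustration.

```python
import math


def relative_bias(rp, w_early, w_late, rp_min=5.0, rp_max=35.0):
    """Square root of the mean ratio w_early / w_late over rp_min..rp_max.

    rp      : projected separations in h^-1 Mpc
    w_early : projected correlation function of the early-forming sample
    w_late  : projected correlation function of the late-forming sample
    """
    ratios = [
        we / wl
        for r, we, wl in zip(rp, w_early, w_late)
        if rp_min <= r <= rp_max
    ]
    if not ratios:
        raise ValueError("no bins fall inside the requested range")
    return math.sqrt(sum(ratios) / len(ratios))


# Illustrative values only: the large-scale bins have a ratio of 1.21.
toy_rp = [1.0, 6.0, 10.0, 20.0, 30.0, 50.0]
toy_w_late = [100.0, 20.0, 10.0, 5.0, 2.0, 1.0]
toy_w_early = [300.0, 24.2, 12.1, 6.05, 2.42, 0.5]
toy_bias = relative_bias(toy_rp, toy_w_early, toy_w_late)
```

Because neighboring bins are strongly correlated, the uncertainty on this quantity should come from resampling (e.g., jackknife) rather than from the scatter among bins.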

4.1.2. Comparison with Theoretical Expectations 

To properly interpret our findings, we first compare the
observed relative bias with that obtained from the z = 0 output
of the L0250 simulation (Section 2). Note that observationally we
can only robustly measure the mean halo mass of our galaxy
samples, but not the form of their halo mass distribution. To
proceed, we thus make the assumption that the distribution of
halo mass of the observed galaxy samples follows a log-normal
form with parameters $M_{\rm cen}$ and $\sigma_{\log M}$

The motivation for adopting this form comes from the mock catalog
of the age-matching model (Section 2.4). For each galaxy in our sample,
we assign a counterpart in the mock by matching the stellar mass and sSFR,
and find that the resulting halo mass distribution of the matched mock galaxy
representing the mean and standard deviation of the Gaussian in
log space. To find the values of $M_{\rm cen}$ and
$\sigma_{\log M}$ appropriate for our galaxy samples, we consider
a grid of combinations of these parameters, with
$\log M_{\rm cen}$ ranging from 11.4 to 12.5 (with an interval of
0.1 dex), and $\sigma_{\log M}$ from 0.05 to 0.45 (with an
interval of 0.05). Given an observed galaxy sample, for each
combination of ($M_{\rm cen}$, $\sigma_{\log M}$), we predict the
corresponding $\Delta\Sigma$ signal and obtain $\chi^2$ by
comparing with the observed profile. We consider all models that
satisfy $\chi^2 \le \chi^2_{\min} + 2.3$ as plausible
(corresponding to the 68% probability interval for two
parameters), where $\chi^2_{\min}$ is given by the model with the
minimum $\chi^2$ on the grid. To further split the halo samples
into early- and late-forming ones, we proceed as follows. Given
the median redshift $z_{\rm med}$ of an observed galaxy sample,
we compute the redshift $z_{\rm div}$ corresponding to a lookback
time 9 Gyr prior to $z_{\rm med}$, and use it as a proxy for the
boundary of the first temporal bin in VESPA. That is, an
early-forming (late-forming) galaxy sample typically has formed
50% of its final stellar mass before (after) $z_{\rm div}$. The
values of $z_{\rm div}$ for our early- and late-forming galaxy
samples are 1.84 and 2.11, respectively. For each of the models
that can represent an observed early-forming (late-forming)
galaxy sample, we select halos with
$z_{\rm form} > z_{\rm div}$ ($z_{\rm form} < z_{\rm div}$) from
the simulation and compute the correlation function. Our
theoretical expectation is then the average over the correlation
functions from all these models. The uncertainties are estimated
from jackknife resampling after splitting the simulation box into
125 sub-cubes. We note that the acceptable models show a
degeneracy between $M_{\rm cen}$ and $\sigma_{\log M}$ (in the
sense that the lower the mass, the higher $\sigma_{\log M}$), and
have $\log M_{\rm cen}$ ranging from 11.6 to 12.0, and
$\sigma_{\log M}$ from 0.05 to 0.45; the spread in correlation
functions among these models (5-6%) is much smaller than the mean
values. The fraction of early-forming halos among the halos used
in the models is 0.37, fairly close to the observed sample
(0.41).
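
The model-selection step can be sketched as a scan over the ($\log M_{\rm cen}$, $\sigma_{\log M}$) grid followed by a $\Delta\chi^2 \le 2.3$ cut. The $\chi^2$ function below is a made-up stand-in (labeled `toy_chi2`); in the actual analysis it would compare a predicted $\Delta\Sigma$ profile with the measured one, which is not reproduced here.

```python
def build_grid():
    """Grid of (log M_cen, sigma_logM) pairs matching the ranges in the text."""
    log_m = [round(11.4 + 0.1 * i, 1) for i in range(12)]    # 11.4 ... 12.5
    sigma = [round(0.05 * (j + 1), 2) for j in range(9)]     # 0.05 ... 0.45
    return [(m, s) for m in log_m for s in sigma]


def acceptable_models(chi2_of, grid, delta_chi2=2.3):
    """Keep grid points with chi2 <= chi2_min + delta_chi2.

    delta_chi2 = 2.3 corresponds to the 68% interval for two parameters.
    """
    chi2 = {point: chi2_of(*point) for point in grid}
    chi2_min = min(chi2.values())
    return [p for p in grid if chi2[p] <= chi2_min + delta_chi2]


def toy_chi2(log_m, sigma):
    """Hypothetical chi2 surface, used only to exercise the selection."""
    return ((log_m - 11.8) / 0.1) ** 2 + ((sigma - 0.25) / 0.1) ** 2


grid = build_grid()
accepted = acceptable_models(toy_chi2, grid)
```

Replacing 2.3 with 4.61 gives the 90% interval for two parameters.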

In the top panel of Fig. 5 we show as magenta (cyan) short
dashed curve the projected correlation function from the
early-forming (late-forming) halo sample. Here we have adopted
$z_{\rm mah}$ for $z_{\rm form}$, but using $z_{50}$ leads to
similar conclusions. The ratio $b^2_{\rm rel}(r_p)$ of the
early-forming-to-late-forming halo correlation functions from the
simulation is shown as the dashed curve in the middle panel of
the Figure, with the (pink) shaded region enclosing the $1\sigma$
uncertainties from the model. Using the covariance matrices built
from the ratio of the early- and late-forming projected
correlation functions and their associated jackknife samples for
both the observations and models ($O$ and $T$, respectively), we
infer the probability of the observed and theoretical relative
bias squared to be consistent with each other from

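$$\chi^2 = \sum_{i,j} \left[\frac{w_e(r_i)}{w_l(r_i)} - b^2_{\rm rel}(r_i)\right] D^{-1}_{ij} \left[\frac{w_e(r_j)}{w_l(r_j)} - b^2_{\rm rel}(r_j)\right], \qquad (12)$$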

where $w_e$ and $w_l$ are the observed early- and late-forming
projected correlation functions, and $D = O + T$. For notational
simplicity, we have omitted the subscript $p$ in $w_p$ and $r_p$.
For these calculations, we consider only $r_p$ bins in the
5-35 $h^{-1}$Mpc range. We find that the theory and our
observation are inconsistent. Given 5 degrees of freedom, the

sample is very close to log-normal, with a standard deviation of 0.2—0.3. 

Our conclusion remains unchanged if the criterion for acceptable
models is changed to $\chi^2 \le \chi^2_{\min} + 4.61$ (i.e., 90%
probability distribution for two parameters).


probability to yield a $\chi^2$ as large as observed (27.3) if
the data and theory were drawn from the same distribution is only
$p = 5.0 \times 10^{-5}$. The corresponding $\chi^2$ when $z_{50}$
is adopted is 64.1, corresponding to a probability well below
$10^{-10}$.
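
The consistency test of Eq. (12) and the conversion of $\chi^2$ into a probability can be sketched with the standard library alone. The helper names are hypothetical, the small linear solver stands in for a proper matrix library, and the closed-form survival function below is valid for integer degrees of freedom only.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def chi2_stat(obs_ratio, model_ratio, cov_obs, cov_model):
    """Eq. (12): delta^T D^-1 delta with D = O + T."""
    n = len(obs_ratio)
    delta = [o - t for o, t in zip(obs_ratio, model_ratio)]
    d = [[cov_obs[i][j] + cov_model[i][j] for j in range(n)] for i in range(n)]
    y = solve(d, delta)                      # y = D^-1 delta
    return sum(di * yi for di, yi in zip(delta, y))


def chi2_sf(x, dof):
    """P(chi^2 >= x) for a chi-square distribution with integer dof."""
    half = 0.5 * x
    if dof % 2 == 0:
        term, total = 1.0, 1.0
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for j in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


p_l0250 = chi2_sf(27.3, 5)   # chi2 and degrees of freedom quoted in the text
```

With the quoted $\chi^2 = 27.3$ and 5 degrees of freedom, this reproduces the stated $p \approx 5.0 \times 10^{-5}$.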

We next turn to the comparison with the predictions from the
Guo et al. (2011) semi-analytic model. Analogous to the approach
described above, for a given central galaxy sample, we again
start with simulated halo samples permitted by the observed
$\Delta\Sigma$ profile, then select early-forming (late-forming)
halos as those hosting central galaxies with
$z_{\rm form} > z_{\rm div}$ ($z_{\rm form} < z_{\rm div}$). Here
$z_{\rm form} = z_{\rm mah}$ is calculated from the stellar mass
assembly history from the Guo et al. (2011) model. The
uncertainties in the model predictions are also calculated with
jackknife resampling of the simulation box. The projected
correlation functions for the early- and late-forming models are
shown as orange and purple long dashed curves in the top panel of
Fig. 5, respectively, and the ratio of the two is shown as the
long dashed curve in the middle panel (the orange shaded region
represents the $1\sigma$ uncertainties). Using Eqn. 12, we find
that the probability that the model and data are consistent is
$p = 0.019$ with $\chi^2 = 13.5$. We note, however, that the
fraction of early-forming halos among the halos used in the
models is only 0.04, far below that of the observed sample. Given
that this model also produces a sSFR distribution that is not
consistent with the observed sSFR, we will not further consider
it in Section 4.2.

It is possible that at mass scales lower than $M_{\rm nl}$ (as
is the case for our samples), the origin of the assembly bias is
partially due to the so-called “backsplash” halos
(Dalal et al. 2008), those that have been accreted onto massive
halos but are on highly elongated orbits and thus would have
spent a substantial amount of time outside of the parent halo’s
virial radius (Wetzel et al. 2014). The member galaxies of these
backsplash halos would appear old because they have been affected
by dense environments of the massive halos, and the bias of these
small halos would be that of their massive parent halos. Our
treatment of satellite removal may exclude some of these halos
from our sample, and therefore inadvertently suppress the effect
of assembly bias. Unless there are clever ways to observationally
distinguish the backsplash halos from low mass halos that are
unrelated to nearby, massive halos, it is a challenge for
analyses like ours to remove only satellites that are bound
within massive halos, but not backsplash galaxies. Perhaps a
better (and more feasible) approach is to mimic the effect of
satellite removal in simulated data. Although a full-blown
analysis is beyond the scope of the present paper, in Section 5
we will use a simple method to roughly estimate the effect of
possible exclusion of backsplash halos on the non-detection of
assembly bias.

Ideally, we would like to explore the assembly bias with halos
over a wide mass range, similar to what is done in
Yang et al. (2006). Unfortunately, at the low mass end, we are
limited by the inability of VESPA to yield a sufficient number of
early-forming galaxies (likely due to the quality of SDSS
spectra), thus rendering the lensing measurement too noisy. At
the high mass end, the number of central galaxies decreases
precipitously and most of them are quite old (beyond the temporal
resolution offered by VESPA), which makes it more difficult to
apply our method.

4.2. Classification by Specific Star Formation Rate 

We next follow a procedure similar to that outlined in
Section 4.1 for samples selected by sSFR. As we now need
Figure 6. Measurements of projected correlation function (top panel) and
surface mass density contrast (bottom panel) for the central galaxies initially
selected with the halo mass estimates given in Y07, then further separated
into low- and high-sSFR samples (red and blue points, respectively), with
the division at sSFR = $10^{-11.\ldots}\,{\rm yr}^{-1}$. The middle panel shows
the relative bias squared of the two samples. The square root of the mean ratio
(over 5-35 $h^{-1}$Mpc) is 1.07 ± 0.14. Galaxy-galaxy lensing indicates the two
samples have mass $M_{200c}$ of
$(1.39^{+\ldots}_{-\ldots}) \times 10^{12}\,h^{-1}M_\odot$ and
$(1.26^{+\ldots}_{-\ldots}) \times 10^{12}\,h^{-1}M_\odot$,
respectively. The curves in the bottom panel represent the best-fit NFW
profiles (magenta: low-sSFR; cyan: high-sSFR). In the top panel the two dashed
curves show the predictions from the age-matching model (Watson et al.
2015). The ratio of the model curves (low-to-high sSFR) is shown in the
middle panel. After taking into account the model uncertainties (represented
by the shaded regions), we find that the model is inconsistent with the
observations.

to consider high- and low-sSFR centrals separately,
operationally it is much more convenient to use the mass
estimates from the Y07 catalog, rather than using the M11
relations, in selecting samples 1 and 2. In constructing the
galaxy samples, we again use the FOF algorithm to remove
contamination from satellite galaxies.

We have thus arrived at a pair of low- and high-sSFR samples
with similar mean masses. They are selected by
$\log(M_{200c}/M_\odot) = 12.0$-$12.5$ and
sSFR $< 10^{-11.\ldots}\,{\rm yr}^{-1}$, and
$\log(M_{200c}/M_\odot) = 12.75$-$13.10$ and
sSFR $> 10^{-11.\ldots}\,{\rm yr}^{-1}$, respectively. After the
FOF step that removes about 10% of the galaxies, the two samples
contain 25,838 and 29,659 galaxies. The lensing measurements are
shown in the bottom panel of Fig. 6, giving the mean masses of
$M_{200c} = (1.39^{+\ldots}_{-\ldots}) \times 10^{12}\,h^{-1}M_\odot$
and $(1.26^{+\ldots}_{-\ldots}) \times 10^{12}\,h^{-1}M_\odot$. The
correlation functions and the ratio of low-to-high-sSFR
correlation functions are shown in the top and middle panels. The
square root of the mean ratio (over 5-35 $h^{-1}$Mpc) is
1.07 ± 0.14. Given that the data points are all above unity, we
use a procedure analogous to that described in Section 4.1 to
examine the consistency between the two samples by calculating

$$\chi^2 = \sum_{i,j} \left[w_{ls}(r_i) - w_{hs}(r_i)\right] O^{-1}_{ij} \left[w_{ls}(r_j) - w_{hs}(r_j)\right], \qquad (13)$$

where $w_{ls}$ and $w_{hs}$ are the projected correlation
functions of the low-sSFR and high-sSFR samples, and $O$ is the
covariance matrix built from the difference between $w_{ls}$ and
$w_{hs}$ and their associated jackknife samples. With
$\chi^2 = 5.5$ from 5 degrees of freedom, we find that the two
samples have a 36% probability to be consistent.
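
The covariance matrix entering Eq. (13) is built from jackknife samples. A minimal sketch of the standard delete-one jackknife estimator is given below; the text does not spell out the estimator, so the $(N-1)/N$ normalization and the function name are assumptions of this sketch. Each row of the input is the statistic of interest (here, the difference $w_{ls} - w_{hs}$ in each $r_p$ bin) recomputed with one sub-region removed.

```python
def jackknife_covariance(samples):
    """Delete-one jackknife covariance matrix.

    samples[k][i] : value of the statistic in bin i when sub-region k
                    is left out of the measurement
    """
    n = len(samples)
    nbin = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(nbin)]
    factor = (n - 1) / n
    return [
        [
            factor * sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples)
            for j in range(nbin)
        ]
        for i in range(nbin)
    ]


# Illustrative jackknife realizations only (3 sub-regions, 2 bins).
toy_samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
toy_cov = jackknife_covariance(toy_samples)
```

The number of jackknife regions should comfortably exceed the number of bins, otherwise the estimated matrix is singular or too noisy to invert.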

In the top and middle panels of the Figure, we compare our
measurements with predictions from the age-matching model
(Watson et al. 2015; see Section 2.4). Following the procedure
outlined in Section 4.1, we construct halo samples that can
represent the observed galaxy samples by approximating the halo
mass distribution as log-normal, and considering halo samples
that provide good fits to the observed $\Delta\Sigma$ profile. We
adopt the same division criterion for low- and high-sSFR as in
the real data (sSFR = $10^{-11.\ldots}\,{\rm yr}^{-1}$). From the
curve in the middle panel (representing the ratio of the two
model curves shown in the top panel) we see that the age-matching
model prediction for the relative bias squared is 1.6. We find
$(\chi^2, p) = (10.5, 0.033)$, indicating the model is
inconsistent with the observation.

As the age-matching model is calibrated against a SDSS galaxy
sample that is slightly different from the one we use
(Watson et al. 2015), it is possible the sSFR distribution in the
model does not match perfectly with our data. After inspecting
the cumulative sSFR distributions in both the real data and the
model, we find it more appropriate to set the sSFR division to be
$10^{-11.\ldots}\,{\rm yr}^{-1}$ in the model for the comparison.
With this adjustment, the fraction of early-forming halos in the
model is 0.41, which is not far from the observed value of 0.47.
The results become $(\chi^2, p) = (20.5, 4.0 \times 10^{-4})$.
Therefore, our observations do not seem to be compatible with the
age-matching model, in which the magnitude of assembly bias is
expected to be maximal (Section 2.4). Such a result may not be
surprising, given that the central galaxy stellar mass-halo mass
relationships based on the age-matching model (particularly that
for blue galaxies) do not match well with the real data, as
indicated by lensing measurements (Mandelbaum et al. 2015).

5. DISCUSSION 

Although we have attempted to make the comparisons between
observations and theoretical predictions as fair as possible,
there are caveats in our analysis that we need to point out,
which may be applicable for other studies of assembly bias as
well. These can be broadly categorized into two themes:
(1) definition of formation epoch, and (2) treatment of galaxy
and halo samples. We discuss these aspects in turn in
Sections 5.1 and 5.2. After addressing these potential concerns,
in Section 5.3 we comment on the halo mass estimates of Y07 and
M11, and in Section 5.4 we consider the implications of our
findings.

5.1. Definition of Formation Time 

A proper definition of both halo and galaxy formation time is
critical, as it affects the expected amplitude of assembly bias
as a function of halo mass, and more importantly, it determines
whether one can faithfully link the observed galaxy population to
the underlying dark matter halos. As we have seen before, the
relative bias between early- and late-forming halos is a function
of halo mass, and the detailed mass dependence actually depends
on whether $z_{50}$ or $z_{\rm mah}$ is adopted for
$z_{\rm form}$ (Fig. 1). This in turn affects any comparison with
observations, especially the inference of the statistical
significance (c.f. Section 4.1).

How can we best link the observed properties of galaxies to
the formation history of the host dark matter halos? Various
groups have adopted different ways to define the formation time
for the central galaxies, including the current sSFR (or
equivalently, broad band optical color) and luminosity-weighted
mean age of the stellar populations (Section 1). One of the new
aspects of our study is the use of resolved SFH from the VESPA
algorithm. To gain insight into the answer to the above question,
we make use of the results from the semi-analytic model of
Guo et al. (2011). We have extracted information from
~157,000 central galaxies whose present-day halo mass is
$M_{200c} = (1$-$2) \times 10^{12}\,h^{-1}M_\odot$, and computed
$z_{50}$ and $z_{\rm mah}$ for both the total mass and the stellar
mass assembly history (in the following denoted with a subscript
“t” and “s”, respectively).

After examining the correlations between the various galactic
properties (including $z_{50,s}$ and $z_{\rm mah,s}$) and the
halo formation time ($z_{50,t}$, $z_{\rm mah,t}$), we have found
that the best correlation is between $z_{\rm mah,s}$ and
$z_{\rm mah,t}$ (with Pearson correlation coefficient r = 0.83),
followed by that between $z_{50,s}$ and $z_{50,t}$ (r = 0.54),
age and $z_{50,t}$ (r = 0.49), and $z_{50,s}$ and $z_{\rm mah,t}$
(r = 0.36). Therefore, if the Guo et al. (2011) model is a good
approximation to the real galaxy populations (see, e.g., Lin et
al. 2013 for the agreement between the model prediction and
observation for the stellar mass assembly history of brightest
cluster galaxies at high redshift), then with a suitable choice
of formation time indicators it is possible to infer the halo
formation history from that of the central galaxies, for the
low mass halos we consider here. Furthermore, if $z_{50,s}$
derived from VESPA is representative of the true value on
average, our results presented in Section 4.1 are on solid
footing.
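
For readers who wish to repeat this kind of ranking on their own catalog, the Pearson coefficient between a galaxy-based indicator and a halo-based formation time can be computed as in the minimal sketch below. The function name is hypothetical, and the inputs are assumed to be equal-length sequences of per-galaxy values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Calling `pearson_r` once per pair of indicators (for example a stellar and a halo formation redshift) and sorting the results reproduces the type of ordering quoted above.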

Ideally, we would like to calculate $z_{\rm mah,s}$ for the
observed galaxy populations; however, we refrain from deriving
it from the VESPA-based SFH here, as the binning in lookback
time for the public VESPA data is not optimal for this purpose.
Furthermore, it remains to be seen whether the VESPA-based SFH
would result in an unbiased $z_{\rm mah}$. In future work we
plan to also investigate the use of other formation time
indicators, such as the strength of the 4000 Å break and the
luminosity-weighted mean age, for the study of assembly bias.

In principle, the signature of assembly bias would be stronger
if one used the extrema of the distribution. Due to the
temporal resolution of the VESPA SFH, in our analysis we
approximately designate galaxies that have
$z_{50,s} \gtrsim 1.8$ as early-forming, and those with
$z_{50,s} \lesssim 2.1$ as late-forming (Section 4). Had we a
higher resolution SFH, we could have examined the distribution
of $z_{50,s}$ (or $z_{\rm mah,s}$) and used only the earliest-
and latest-forming 20% for the clustering and lensing
measurements. Such an analysis would, however, require both
much better quality spectra and a much larger sample size (or
much deeper imaging data than SDSS), and is therefore not yet
feasible.
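
The selection of the extreme 20% tails mentioned above could be sketched as follows. This is a hypothetical helper, assuming one formation redshift per galaxy and that a higher value means an earlier formation; it is not part of the analysis presented here.

```python
def split_extremes(z_form, fraction=0.2):
    """Indices of the earliest- and latest-forming objects.

    Returns (early, late): the `fraction` of objects with the highest
    and the lowest formation redshift, respectively.
    """
    if not 0.0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    n_pick = int(fraction * len(z_form))
    # Indices sorted by increasing formation redshift (latest-forming first).
    order = sorted(range(len(z_form)), key=lambda i: z_form[i])
    late = order[:n_pick]
    early = order[len(order) - n_pick:]
    return early, late
```

The two index lists can then be used to build the subsamples for the clustering and lensing measurements; with a coarse SFH binning many galaxies share the same formation redshift, which is why such a percentile cut is not meaningful for the current data.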

5.2. Treatment of Galaxy and Halo Samples 

One potential concern regarding our way of constructing pure
samples of central galaxies is the FOF removal of satellites.
While we believe this is a necessary operation, some of the
galaxies thus removed may be those in the so-called backsplash
halos, which may partially contribute to the signal of assembly
bias at low mass scales (Wang et al. 2007; Dalal et al. 2008;
see Section 4.1). If true, this may complicate the
interpretation of our results (in other words, the
non-detection of assembly bias may be due in part to the
removal of these galaxies).

Here we attempt to evaluate the effect of satellite removal on
the magnitude of assembly bias using simulated data. We repeat
the procedure of Section 4.1 and consider halo samples that
have a mass distribution following the log-normal form. The
acceptable combinations of ($M_{\rm cen}$, $\sigma_{\log M}$)
are again those that give $\chi^2 < \chi^2_{\rm min} + 2.3$ for
a given observed galaxy sample. For a halo sample constructed
from a given parameter set ($M_{\rm cen}$, $\sigma_{\log M}$),
we remove halos that are located within $2 r_{200c}$ of any
more massive halo in the whole simulation box (that is, not
restricted to those that satisfy the log-normal mass
distribution). For the remaining “isolated” halos, which may
correspond crudely to our satellite-trimmed galaxy sample, we
apply the $z_{\rm form} > z_{\rm div}$
($z_{\rm form} < z_{\rm div}$) criterion as before to further
filter the halos, depending on whether the observed sample in
question is early- or late-forming. Finally, the theoretical
expectation is obtained by averaging over the correlation
functions from all acceptable models. For the L0250 simulation
used in Section 4.1, this procedure typically removes about 8%
of the halos, and reduces the ratio of the resulting
correlation functions (early-to-late) by ~10% compared to that
before the removal. Such a reduction would make theoretical
predictions more compatible with our observations. We find
that the probability that the theory and the observed data are
drawn from the same distribution is p = 0.055 (with
$z_{\rm form} = z_{\rm mah}$). For the age-matching halo sample
(Section 4.2), about 30% of the halos are removed, resulting in
a larger reduction in the ratio of correlation functions, and
the probability for the model to be consistent with the
observations is p = 0.0041.
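
The isolation cut described above can be sketched as a brute-force filter. The version below is illustrative only: the function name is hypothetical, the exclusion radius is assumed to be set by $r_{200c}$ of the more massive neighbor (the text does not specify which halo's radius is meant), and coordinates are assumed to lie in a periodic cube of side `box_size`.

```python
import math


def isolated_flags(positions, masses, radii, box_size, factor=2.0):
    """Flag halos farther than factor * r200c from every more massive halo.

    positions: list of (x, y, z) with coordinates in [0, box_size);
    radii: r200c of each halo, in the same length units as positions.
    Brute-force O(N^2) loop, meant for illustration only.
    """
    flags = []
    for i, (pos_i, m_i) in enumerate(zip(positions, masses)):
        isolated = True
        for j, (pos_j, m_j) in enumerate(zip(positions, masses)):
            if j == i or m_j <= m_i:
                continue  # only more massive halos can exclude halo i
            # Minimum-image separation in a periodic box.
            d2 = 0.0
            for a, b in zip(pos_i, pos_j):
                d = abs(a - b)
                d = min(d, box_size - d)
                d2 += d * d
            # Assumption: the exclusion radius is that of the more massive halo.
            if math.sqrt(d2) < factor * radii[j]:
                isolated = False
                break
        flags.append(isolated)
    return flags
```

For a full simulation box a spatial tree would normally replace the double loop; the logic of the cut is unchanged.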

Admittedly this procedure is rather crude, but it should
capture the essence of the effect of backsplash halos. We see
that such halos may indeed contribute partly to the assembly
bias, which is consistent with the findings of Wang et al.
(2009).

Another potential concern is the form and width of the mass
distribution of the real galaxy samples. When matching an
early-forming sample with a late-forming one, we only require
their mean halo masses to be consistent (e.g., within
$1\sigma$), but not the distribution in mass. Our assumption
that the halo mass distribution follows the log-normal form is
informed by the age-matching model (Section 4.1), and with this
form the centroid and width of the distribution may be roughly
determined (as constrained by the observed $\Delta\Sigma$
profile). Although we have limited the mass range during the
initial sample selection to be ~0.5 dex or smaller (based on
the central stellar mass-halo mass relation, or the mass given
in the Y07 catalog), we note that these proxies are derived in
the presence of satellites, and thus may bias the actual mass
spread in an unpredictable way. A possible way forward is to
construct (satellite-trimmed) central-halo relations with SFH-
or sSFR-selected samples (rather than those selected by stellar
mass or luminosity thresholds, as commonly used), and use them
to better constrain ($M_{\rm cen}$, $\sigma_{\log M}$) of the
galaxy samples in question.

Finally, when we fit the NFW profile to the lensing
measurements, we have assumed that the concentration is only a
function of halo mass, and ignored any dependence of
concentration on the halo formation time. This is mainly
because the S/N of the galaxy-galaxy lensing profiles is
insufficient to fit for concentration as well. Such an
assumption could bias the masses of our galaxy samples, in the
sense that the mass of the early-forming (late-forming) sample
would be overestimated (underestimated) when the mean
concentration is used. From the L0250 simulation, we have found
that the concentration is ~60% higher in early-forming halos
than in late-forming ones. Adopting such a difference in our
lensing measurements leads to roughly a 10% reduction
(increase) in halo mass for the early-forming (late-forming)
galaxy sample. Given that the difference in mass as presented
in Section 4 (i.e., when the mean concentration is used) and
our mass measurement uncertainties are both at similar levels
(10-20%), and that observationally the difference in
concentration of halos hosting red and blue central galaxies is
much smaller than the 60% value used above (Mandelbaum et al.
2015), we conclude that assuming concentration is only a
function of halo mass does not have an appreciable effect on
our analysis.
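
For reference, the standard NFW parametrization that underlies the mass-concentration interplay discussed above is reproduced below. These are the textbook definitions; whether the lensing fits use exactly this convention (e.g., for the overdensity and the reference density) is an assumption here.

```latex
\rho(r) = \frac{\rho_s}{(r/r_s)\,(1 + r/r_s)^2}, \qquad
c \equiv \frac{r_{200c}}{r_s}, \qquad
M_{200c} = \frac{4\pi}{3}\, 200\, \rho_{\rm crit}\, r_{200c}^3 .
```

With $c$ fixed by a mass-concentration relation, $M_{200c}$ is the only free parameter of the fit, which is why a formation-time dependence of $c$ can translate into a bias in the inferred mass.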

5.3. Comment on Abundance Matching- and Satellite 
Kinematics-based Halo Mass Estimates 

In Sections 3 and 4 we have used the halo mass estimates from
Y07 and M11, which are based on an approach similar to
abundance matching and on satellite kinematics, respectively,
to guide our initial sample selection. Although we have not
carried out a systematic comparison of these estimates with
weak lensing (see Mandelbaum et al. 2015 for such an effort),
during our two-step procedure of constructing galaxy samples we
have built, as a by-product, three samples (selected only by
color) that can test the M11 relations. It appears that the red
and blue central stellar mass-halo mass relations from M11 give
rise to consistent halo masses, although these masses are ~40%
higher than those indicated by lensing. This is consistent with
the finding of Kravtsov et al. (2013).

As for the mass provided by Y07, which is obtained in a fashion
similar to the abundance matching technique, our lensing
measurements in Sections 3 and 4.2 imply that the scatter in
their mass estimates somehow correlates with basic physical
properties of galaxies such as SFH or sSFR, at least in the low
halo mass regime we study here. Therefore, any study that
assumes or requires the scatter in halo mass to be random
should be cautious when adopting the mass estimates from this
catalog.

5.4. Implications of Our Results 

As we have demonstrated in Section 2.4, together with numerous
previous studies (e.g., Gao et al. 2005; Wechsler et al. 2006;
Jing et al. 2007; Li et al. 2008), halo assembly bias is a
marked feature of the CDM model, especially at the low mass
scales we study (~$10^{12} h^{-1} M_\odot$). How can this be
reconciled with our lack of detection in galaxy populations? It
is possible that the baryonic processes of galaxy formation
have rendered the signal small, that the SFH derived by VESPA
is too noisy so that the signal is washed out (for the case of
Section 4.1), that the sSFR measurements from SDSS are too
noisy to be a good indicator of the halo formation epoch (for
the case of Section 4.2), or that we have not yet found a
galactic property that is closely linked to the halo formation
history.

To check the first possibility, we again make use of the galaxy
catalog from the Guo et al. (2011) model. For central galaxies
living in halos of mass
$M_{200c} = (1-2) \times 10^{12} h^{-1} M_\odot$, we compute the
mean of the ratio $\xi_{\rm early}/\xi_{\rm late}$ over 5-20
$h^{-1}$ Mpc, where the early- and late-forming samples are
defined by several different conditions: (1) by the halo
$z_{\rm mah,t}$, (2) by the stellar $z_{\rm mah,s}$, (3) by the
stellar $z_{50,s}$, and (4) by the stellar age. For each
formation time indicator, we designate the galaxies with values
higher (lower) than the mode of the distribution to be
early-forming (late-forming). Comparing the ratio from cases
(2)-(4) with that of case (1) informs us whether the signature
of assembly bias is erased by (this particular model of) galaxy
formation. We find that the ratio for cases in which the
formation history is inferred from galactic properties is
similar to, and not smaller than, that derived purely from the
dark matter halo assembly history; it is thus probable that
galaxy formation preserves assembly bias (c.f. Wang et al.
2013).
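
The summary statistic used in this test, the mean early-to-late ratio of the correlation functions over a fixed range of scales, could be computed as in the sketch below. The function name is hypothetical, and an unweighted mean over the radial bins that fall in the range is assumed; the actual analysis may weight the bins differently.

```python
def mean_ratio(r, xi_early, xi_late, r_min=5.0, r_max=20.0):
    """Unweighted mean of xi_early / xi_late over bins with r_min <= r <= r_max.

    r is assumed to be in the same units as r_min and r_max
    (here h^-1 Mpc), with one correlation function value per bin.
    """
    ratios = [
        early / late
        for radius, early, late in zip(r, xi_early, xi_late)
        if r_min <= radius <= r_max
    ]
    if not ratios:
        raise ValueError("no radial bins inside the requested range")
    return sum(ratios) / len(ratios)
```

Evaluating `mean_ratio` once for each of the four sample definitions and comparing cases (2)-(4) against case (1) mirrors the comparison made in the text.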

It is certainly possible that the VESPA-based SFH is too noisy.
Tests with mock galaxy spectra carried out in Tojeiro et al.
(2009) indicate that for simple SFHs (exponential decay or
dual-burst), the recovery of the SFH is satisfactory. The SFH
of our central galaxies would inevitably be much more
complicated than the simple cases tested there, and whether
VESPA can reliably decipher the SFH needs to be checked with
other tools such as STARLIGHT (Cid Fernandes et al. 2005), FAST
(Kriek et al. 2009), MAGPHYS (da Cunha et al. 2008), and
FIREFLY (Wilkinson et al. 2015).

In Section 2.3 we have noted that in the Watson et al. (2015)
model, the sSFR of a central galaxy is assumed to have a
one-to-one correspondence with the formation epoch of the host
halo. Assembly bias is thus maximally built into this model.
The fact that the probability for the model and our
observations to be consistent is at most at the percent level
(Section 5.2) suggests that either the intrinsic scatter in the
sSFR-formation epoch correspondence is much larger than assumed
in the age-matching model (thus rendering the effect of
assembly bias too small to be detectable), or the sSFR
measurements from SDSS are not adequate for picking up the
assembly bias signal, or a combination of both.

Finally, we entertain the possibility that we have yet to
employ a galactic property that better separates galactic
systems into early- and late-forming ones, before we can
unambiguously identify the assembly bias signature in the
Universe. As discussed in Section 5.1, the best candidate
appears to be $z_{\rm mah,s}$, followed by $z_{50,s}$ and the
mean age (see also the proxy discussed in Lim et al. 2016).
These in principle could be obtained from high quality spectra
with VESPA, as well as the aforementioned codes, and will be
subjects of our future investigation. Inspiration could also
come from the analysis of Miyatake et al. (2015), who have
recently claimed a detection of strong assembly bias using
galaxy clusters. Their proxy for halo formation time is
$\langle R_{\rm mem} \rangle$, the mean projected separation of
member galaxies from the cluster center. In principle it is
possible to adopt a similar proxy for galaxy scale halos,
although the number of satellites is much smaller, and the
membership determination is less certain.

We conclude by noting that it is also imperative to better
understand theoretically the origin of assembly bias across the
mass spectrum of halos. If the cause at low mass scales is the
backsplash halos, then instead of identifying the best
formation time indicator, one should look for ways to
observationally distinguish such galactic systems, or to take
this population fully into account when comparing with
theoretical models. Regarding the high mass end, since the
physical origin of assembly bias in high-mass halos is believed
to be quite different from that in low-mass halos (Dalal et al.
2008), the detection presented by Miyatake et al. (2015) is not
necessarily inconsistent with the non-detection reported here.
It is clear that the assembly bias phenomenon is far richer and
more complicated than expected, and warrants further
investigation, such as exploring better proxies for halo
formation time and the scale dependence of assembly bias
(Section 2.4.1).

6. SUMMARY 

As assembly bias is a robust prediction of the CDM theory of
structure formation, establishing it observationally would not
only further vindicate this extremely successful theory, but
also shed light on the baryonic physics of galaxy formation. In
this exploratory study of the detection of assembly bias, we
have shown (with the aid of weak gravitational lensing) that
some previous claims of detection may be simply due to
differences in halo mass of the galaxy samples, rather than a
real manifestation of assembly bias (Section 3). We have then
investigated a couple of ways of constructing galaxy samples
with similar mean halo masses, thus facilitating a direct
search for assembly bias in the real data (Section 4). We focus
on the halo mass scale of ~$10^{12} h^{-1} M_\odot$, where
assembly bias is expected to be large, and construct early- and
late-forming halo samples by making use of a central galaxy
catalog (Y07), under the assumption that the SFH of central
galaxies reflects the formation history of the underlying halos
(which is well supported by a state-of-the-art galaxy formation
model; Section 5.1). Satellite galaxies living in massive halos
that are misidentified as central galaxies in the low mass
halos we target would bias the mass estimates of the samples,
and thus need to be removed. We employ a FOF algorithm to
achieve this. Working with satellite-free samples, we consider
two ways of inferring the formation epoch of the central
galaxies, namely the resolved SFH from VESPA (Section 4.1) and
the current sSFR (Section 4.2).

In both cases, after using weak lensing to make sure that the
mean halo masses of the early- and late-forming samples are
comparable, we compare the relative large-scale bias of the
samples with predictions from numerical simulations, finding
that the probability of the model and data being consistent is
very low. We attribute this inconsistency to the possibilities
that the formation epoch indicators we use are too noisy as
derived from current data, or that they do not correlate well
with the actual halo formation history. Although observational
evidence for assembly bias remains elusive, we suggest a few
indicators that should perform better for the distinction
between early- and late-forming halos, which could be obtained
with high signal-to-noise spectra of central galaxies
(Section 5.4).